To Catch a Chorus: Using Chroma-based Representations for Audio Thumbnailing

Authors

  • Mark A. Bartsch
  • Gregory H. Wakefield
Abstract

An important application for use with multimedia databases is a browsing aid, which allows a user to quickly and efficiently preview selections from either a database or from the results of a database query. Methods for facilitating browsing, though, are necessarily media dependent. We present one such method that produces short, representative samples (or “audio thumbnails”) of selections of popular music. This method attempts to identify the chorus or refrain of a song by identifying repeated sections of the audio waveform. A reduced spectral representation of the selection based on a chroma transformation of the spectrum is used to find repeating patterns. This representation encodes harmonic relationships in a signal and thus is ideal for popular music, which is often characterized by prominent harmonic progressions. The method is evaluated over a sizable database of popular music and found to perform well, with most of the errors resulting from songs that do not meet our structural assumptions.
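As a rough illustration of the idea in the abstract, the Python sketch below folds a magnitude spectrum into 12 pitch-class (chroma) bins and then searches a sequence of chroma frames for the segment that best matches a later copy of itself. It is not the authors' implementation: the function names, the A4 = 440 Hz reference, the dot-product similarity, and the brute-force search are all assumptions made for illustration.

```python
import math


def chroma_from_spectrum(magnitudes, sample_rate, n_fft, ref_freq=440.0):
    """Fold a magnitude spectrum into 12 chroma bins (index 0 = C).

    Assumption: bin k of `magnitudes` sits at k * sample_rate / n_fft Hz,
    and pitch classes are measured relative to A4 = `ref_freq`.
    """
    chroma = [0.0] * 12
    for k in range(1, len(magnitudes)):  # skip the DC bin (0 Hz)
        freq = k * sample_rate / n_fft
        # Nearest semitone relative to A4; octave information is discarded.
        semitone = round(12 * math.log2(freq / ref_freq))
        chroma[(semitone + 9) % 12] += magnitudes[k]  # A maps to index 9
    norm = math.sqrt(sum(c * c for c in chroma))
    # Unit-normalise so that a dot product acts as a cosine similarity.
    return [c / norm for c in chroma] if norm > 0 else chroma


def find_repeated_segment(frames, length, min_lag=None):
    """Find the segment of `length` frames most similar to a later copy.

    Returns (start, lag, score), where frames[start:start + length] is
    compared with frames[start + lag:start + lag + length] and score is
    the mean frame-wise dot product. Brute force, for illustration only.
    """
    n = len(frames)
    if min_lag is None:
        min_lag = length  # assumption: the two copies must not overlap
    best = (None, None, -1.0)
    for lag in range(min_lag, n - length + 1):
        for start in range(0, n - length - lag + 1):
            total = 0.0
            for t in range(length):
                a = frames[start + t]
                b = frames[start + lag + t]
                total += sum(x * y for x, y in zip(a, b))
            score = total / length
            if score > best[2]:
                best = (start, lag, score)
    return best
```

In this sketch the returned `start` and `length` would mark a candidate thumbnail. A song whose most repeated passage is not the chorus would defeat it, which matches the abstract's remark that most errors come from songs that do not meet the structural assumptions.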

Similar resources

Chorus Detection in Songs of Pop Music

This contribution addresses the problem of chorus detection for audio thumbnailing. We utilise the similarity matrix in order to locate within a pop song the position of its chorus (to be the audio thumbnail). Our focus lies on the problem of analysing the similarity matrix. We do so by filtering its elements including methods known from image processing and using the resulting data. The algorit...

Chorus Detection with Combined Use of Mfcc and Chroma Features and Image Processing Filters

A computationally efficient method for detecting a chorus section in popular and rock music is presented. The method utilizes a distance matrix representation that is obtained by summing two separate distance matrices calculated using the mel-frequency cepstral coefficient and pitch chroma features. The benefit of computing two separate distance matrices is that different enhancement operations...

Understanding Features and Distance Functions for Music Sequence Alignment

We investigate the problem of matching symbolic representations directly to audio based representations for applications that use data from both domains. One such application is score alignment, which aligns a sequence of frames based on features such as chroma vectors and distance functions such as Euclidean distance. Good representations are critical, yet current systems use ad hoc constructi...

Audio Thumbnailing Using MPEG-7 Low Level Audio Descriptors

In this paper we present an audio thumbnailing technique based on audio segmentation by similarity search. The segmentation is performed on MPEG-7 low level audio feature descriptors as a growing source of multimedia meta data. Especially for database applications or audio-on-demand services this technique could be very helpful, because there is no need to have access to the probably copyright ...

A Segment-Based Fitness Measure for Capturing Repetitive Structures of Music Recordings

In this paper, we deal with the task of determining the audio segment that best represents a given music recording (similar to audio thumbnailing). Typically, such a segment has many (approximate) repetitions covering large parts of the music recording. As main contribution, we introduce a novel fitness measure that assigns to each segment a fitness value that expresses how much and how well th...

Publication date: 2001